(54) Method of encoding digital audio signals

(57) The method of encoding digital data of the present
invention makes it possible to change the minimum limit of
audibility characteristics and/or the masking characteristics,
which are usually set on the basis of the aural-psychological
characteristics of persons with typical hearing, thus changing
the allocation of quantized bits to each frequency band and
allowing selection of a sound quality which accords with the
listener's hearing. The present invention is suitable for
ATRAC, a method of compressed encoding for mini-discs.

[FIG. 1]

Description

FIELD OF THE INVENTION 

The present invention relates to a method of encod- 
ing digital data in which, when recording musical tones, 
sounds, etc. in recording media such as mini-discs, bits 
are allocated to the spectrum of each frequency band in 
response to the musical tones, sounds, etc. so as to 
compress data volume. 

BACKGROUND OF THE INVENTION 

One method of highly efficient compressed encoding of
digital data such as musical tones and sounds is ATRAC
(Adaptive Transform Acoustic Coding), used in mini-discs.
In ATRAC, in order to compress the digital data with high
efficiency, the data is first broken down into a plurality of
frequency bands, then divided into blocks in accordance with
time units of variable length, transformed into spectral
signals by MDCT (Modified Discrete Cosine Transform)
processing, and then each spectral signal is encoded with the
number of quantized bits which have been allocated to it,
taking into account aural-psychological characteristics.

Among the aural-psychological characteristics which can
be applied to the compressed encoding are loudness-level
characteristics and the masking effect. Loudness-level
characteristics show that, even at the same sound pressure
level, the loudness of a sound sensed by a person changes
according to the frequency of the sound. Accordingly, the
minimum limit of audibility, which is the smallest loudness
which can be heard by a person, also changes according to
the frequency. As for the masking effect, there are two
kinds: the simultaneous masking effect and the elapsed
masking effect. The simultaneous masking effect is a
phenomenon in which, when several sounds of different
frequency composition occur simultaneously, one sound
makes another difficult to hear. The elapsed masking effect
is a phenomenon in which masking occurs before and after a
loud sound along the time axis.

An example of conventional art which makes use of the
elapsed masking effect is Japanese Unexamined Patent
Publication No. 5-91061/1993. In this conventional art, when
a transient signal is included in one of the frequency
conversion time units, bits are allocated in accordance with
a word length which varies depending on the energy of
previous time units and on the amount of masking, thereby
preventing a sound quality deterioration called "pre-echo."
Again, Japanese Unexamined Patent Publication No.
5-248972/1993 proposes a technique for improving the
efficiency of encoding by using elapsed masking in reference
to the spectral distribution of previous time units.

Another example of bit allocation using the
aural-psychological characteristics is one called the
repetition method, in which actual bit allocation suited to
the input digital data is performed as follows. First, the
power S of each frequency band, and the masking threshold M
of that power S on the other frequency bands, are found.
Next, from the masking threshold M and the power of quantized
noise N(n) (when each frequency band is quantized into n
bits), the ratio of the masking threshold to noise,
MNR(n) = M/N(n), is calculated. Then, after bits are
allocated to the frequency band with the smallest ratio of
masking threshold to noise MNR(n), that ratio MNR(n) is
re-calculated, and bits are again allocated to the frequency
band with the lowest ratio.

Note that the aural characteristics of persons with
typical aural characteristics are the model for the minimum
limit of audibility, masking threshold, etc. mentioned
above. Accordingly, there are cases where listeners will
feel a sense of incongruity due to differences in hearing or
preference.

For example, in cases where the spectral composition of
the input digital data is comparatively flat, like white
noise, bit allocation will be made with the masking
threshold at the minimum limit of audibility, so most of the
quantized bits will be allocated to the mid- to low-range.
Accordingly, depending on the size of the spectral
composition, quantized bits may not be allocated to the
ultra-low and ultra-high ranges, giving some listeners a
sense of incongruity.

Again, when the input digital data is a composite wave
composed of a signal with a narrow spectrum band (such as a
sine wave signal) and white noise, the frequency bands f1
which include the sine wave signal will have more power, but
as for frequency bands f2 which are far from the frequency
bands f1, the farther from the frequency bands f1, the
greater the drop in power. Accordingly, there will be almost
no masking from the sine wave signal at a frequency band f2,
and the influence of masking from the power of the frequency
band f2 itself is increased. Because of this, there will be
no great difference between the ratio of signal to masking
threshold (SMR: the ratio of a frequency band's own power S
to masking threshold M) at the frequency bands f1 and the
same ratio SMR at the frequency bands f2.

In other words, if the power of a signal is S, and the
power of quantized noise is N(n) when each frequency band is
quantized into n bits, then, based on the relative
relationship between the two, the ratio of masking threshold
to noise MNR(n) = M/N(n) = (S/N(n))/(S/M) will be
approximately the same value at the frequency bands f1 and
f2. Accordingly, since the conventional adaptive bit
allocation methods perform bit allocation based only on the
ratio of masking threshold to noise MNR(n), their drawback
is that approximately the same number of bits is allocated
to the frequency bands f1 and f2.

As a result, if there are many frequency bands f2 which
are not influenced by the masking from the sine wave signal,
the number of bits allocated to the frequency bands f1 which
include the sine wave signal becomes relatively smaller, the
quantization error of the sine wave signal becomes greater,
and sound quality deteriorates.

In regard to this point, the present Applicant has 
proposed, in Japanese Unexamined Patent Publication 
7-202823/1995, a structure which automatically limits 
the number of bits which may be allocated to frequency 
bands with low power S. However, a drawback of this 
conventional art is that, since the maximum number of 
bits which may be allocated to each frequency band is 
determined on the basis of its power, when the power of 
white noise is large, there are cases when no limitation 
on bit allocation to that frequency band is made. 

SUMMARY OF THE INVENTION 

One object of the present invention is to provide a 
method of encoding digital data capable of attaining a 
sound quality which accords with the listener's hearing. 

Another object of the present invention is to provide 
a method of encoding digital data capable of preventing 
deterioration of sound quality even of signals with nar- 
row spectrum bands. 

In order to realize the first object mentioned above, 
the first method of encoding digital data of the present 
invention encodes digital data such as musical tones 
and sounds by converting it into frequency domains, 
dividing the converted spectra into a plurality of fre- 
quency bands, changing a minimum limit of audibility 
characteristic so as to set a masking threshold, and allo- 
cating quantized bits for each frequency band in accord- 
ance with ratios of masking threshold to noise which are 
found for each frequency band in accordance with 
power or energy of each frequency band in considera- 
tion of aural-psychological characteristics. 

The above structure, by enabling change of the minimum
limit of audibility characteristic among the
aural-psychological characteristics, frees the
aural-psychological characteristics from definition by the
characteristics of persons with typical hearing, and makes
it possible to select whether or not to allocate bits to
small spectra in the inaudible range, or to spectra in the
ultra-low or ultra-high ranges. Accordingly, it becomes
possible to respond to persons with superior hearing or to
individual, subjective preference, and sound quality which
accords with listeners' hearing can be attained.

Next, in order to realize the first object mentioned
above, the second method of encoding digital data of the
present invention encodes digital data such as musical tones
and sounds by converting it into frequency domains, dividing
the converted spectra into a plurality of frequency bands,
changing a masking characteristic so as to set a masking
threshold, and allocating quantized bits for each frequency
band in accordance with ratios of the masking threshold to
noise for each frequency band which are found in accordance
with power or energy of each frequency band in consideration
of aural-psychological characteristics.

The above structure, by enabling change of the masking
characteristic among the aural-psychological
characteristics, frees the aural-psychological
characteristics from definition by the characteristics of
persons with typical hearing, and makes it possible to
select whether to allocate bits to spectra which, for
example, suffer masking in a critical band. Accordingly, it
becomes possible to respond to persons with superior hearing
or to individual, subjective preference, and sound quality
which accords with listeners' hearing can be attained.

Next, in order to realize the first object mentioned
above, the third method of encoding digital data of the
present invention encodes digital data such as musical tones
and sounds by converting it into frequency domains, dividing
the converted spectra into a plurality of frequency bands,
and switching among (i) bit allocation in accordance with
ratios of masking threshold to noise which are found for
each frequency band in accordance with power or energy of
each frequency band in consideration of aural-psychological
characteristics, (ii) bit allocation in accordance with a
representative value of the power or the energy of each
frequency band, and (iii) bit allocation giving weight to
each of the foregoing bit allocation methods.

With respect to data, such as white noise, having a
spectral composition which is comparatively flat, the above
structure makes possible bit allocation which is flat along
the frequency axis. Again, with respect to data, such as
sine wave signals, with narrow band width, the above
structure makes possible bit allocation which emphasizes the
signal with narrow band width. Accordingly, selection of a
sound quality which is suited to the source of the musical
tone is made possible.

Finally, the fourth method of encoding digital data of
the present invention, in order to realize the second object
mentioned above, switches among bit allocation methods (i),
(ii), and (iii) described in the third method of encoding
digital data in accordance with a relationship between the
masking threshold and peaks and local peaks found based on
differences in power or energy between adjacent spectra
within each frequency band.

The above structure makes it possible to automatically
allocate bits according to the method most suited to the
digital data, whether it is white noise or other data with
wide band width, or sine wave signals or other data with
narrow band width, thus preventing deterioration of sound
quality, even with musical tones not suited to bit
allocation using simultaneous masking such as the masking
threshold/noise ratio.

The other objects, features, and superior points of the
present invention will be made clear by the description
below. Further, the advantages of this invention will be
evident from the following explanation in reference to the
Figures.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a frequency spectrum diagram for 
explaining the method of encoding according to the first 
embodiment of the present invention. 

Figure 2 is a block diagram showing the electrical 
structure of a mini-disc recording and reproduction 
device, which is one example of application of the 
present invention. 

Figure 3 is a flow-chart for explaining the bit-alloca- 
tion method according to the first embodiment of the 
present invention. 

Figure 4 is a flow-chart for explaining the bit-alloca- 
tion method according to the second embodiment of the 
present invention. 

Figure 5 is a flow-chart for explaining the bit-alloca- 
tion method according to the third embodiment of the 
present invention. 

Figure 6 is a frequency spectrum diagram for 
explaining operations for detection of peaks and local 
peaks in the bit-allocation method shown in Figure 5. 

DESCRIPTION OF THE EMBODIMENTS 

The first embodiment of the present invention will 
be explained below, in reference to Figures 1 through 3. 

Figure 1 is a frequency spectrum diagram for explaining
the method of encoding digital data according to the first
embodiment of the present invention, and Figure 2 is a block
diagram showing the electrical structure of a mini-disc
recording and reproduction device 1, which is one example of
application of the present invention. First, in reference to
Figure 2, the mini-disc recording and reproduction device 1
will be explained. First, digital data, for example in the
form of light signals, is serially inputted to an input
terminal 2 from a digital audio signal source (not shown)
such as a compact disc reproduction device or a satellite
broadcast receiver. After the light signals are converted
into electric signals by a photoelectric element 3, they are
sent to a digital PLL circuit 4. The digital PLL circuit 4
extracts the clock from the digital data, and recreates
multibit data corresponding to the sampling frequency and
the number of quantized bits. Next, in a frequency
conversion circuit 5, the multibit data undergoes sampling
rate conversion to the 44.1 kHz conforming to the mini-disc
standard from, for example, the 44.1 kHz sampling frequency
of compact discs, the 48 kHz sampling frequency of digital
audio tape recorders, or the 32 kHz sampling frequency of
satellite broadcasts (A mode), and is then sent to an audio
compression circuit 6.

The audio compression circuit 6 performs compressed
encoding of the input data according to the foregoing ATRAC
method. The encoded audio data is sent through a shock-proof
memory controller 7 to a signal processing circuit 8. A
shock-proof memory 9 is provided in association with the
shock-proof memory controller 7. In addition to absorbing
the difference in transfer rates between the audio data
outputted from the audio compression circuit 6 and the audio
data inputted to the signal processing circuit 8, the
shock-proof memory 9 also serves to protect the audio data
by interpolation of any breaks which occur in the playback
signal due to disturbance such as vibration during the
playback operation, which will be discussed below.

The signal processing circuit 8 functions as an encoder
and decoder, and encodes the audio data as magnetic
modulation signals before sending it to a head driving
circuit 11. The head driving circuit 11 moves a recording
head 12 to the desired recording location on a
magneto-optical disc 13, and causes the recording head 12 to
emit a magnetic field corresponding to the magnetic
modulation signals. At this time, laser light is projected
from an optical pickup 21 onto the desired recording
location on the magneto-optical disc 13, and a magnetized
pattern corresponding to the magnetic field emitted by the
recording head 12 is formed on the magneto-optical disc 13.

During the playback operation, on the other hand, serial
signals corresponding to the magnetized pattern on the
magneto-optical disc 13 are reproduced by the optical pickup
21, and after the serial signals thus reproduced are
amplified by a high-frequency (RF) amplifier 22, they are
sent to the signal processing circuit 8 and decoded into
audio data. After the shock-proof memory controller 7 and
the shock-proof memory 9 have eliminated the influence of
any disturbance on the decoded audio data, the data is sent
to an audio expansion circuit 23. The audio expansion
circuit 23 performs a conversion process which is the
reverse of compressed encoding according to the ATRAC
method, and demodulates the audio data into full-bit digital
audio signals. The demodulated digital audio signals are
converted into analog audio signals by a digital/analog
(D/A) conversion circuit 24, and are then outputted from an
output terminal 25.

The serial signals amplified by the high-frequency
amplifier 22 are also sent to a servo circuit 31. In
response to the serial signals which have been reproduced,
the servo circuit 31 exerts feedback control on the
revolution speed of a spin motor 33 through a driver circuit
32, thus enabling reproduction at the desired linear
velocity. The servo circuit 31 also exerts feedback control
on the revolution speed of a feed motor 34, thus enabling
control of the position of the optical pickup 21 in the
radial direction of the magneto-optical disc 13, i.e.,
control of tracking. Finally, the servo circuit 31 also
exerts feedback control on the focusing of the optical
pickup 21.

The servo circuit 31, the optical pickup 21, the
high-frequency amplifier 22, the signal processing circuit
8, and the driver circuit 32 are energized by a power ON/OFF
circuit 35. The power ON/OFF operations of the power ON/OFF
circuit 35 and the signal processing operations of the
signal processing circuit 8, which will be discussed below,
are centrally managed by a system control microcomputer 36.
In association with the system control microcomputer 36 is
provided an input operating means 37, which enables
sound-quality selection operations, which will be discussed
below, as well as song title input, song selection
operations, etc.

Next, the bit allocation method in the first embodiment
of the present invention, which is performed according to
the ATRAC method by the audio compression circuit 6 of the
mini-disc recording and reproduction device 1 structured as
described above, will be explained, referring to Figures 1
and 3.

In the ATRAC method, the audio data sampled at 44.1 kHz,
as mentioned above, is divided into certain frequency bands,
specifically a Low frequency band from 0 kHz to 5.5 kHz, a
Middle frequency band from 5.5 kHz to 11 kHz, and a High
frequency band from 11 kHz to 22 kHz, and the audio data
bridging certain time frames for each divided frequency band
is converted, by means of the MDCT processing, into MDCT
coefficients, which are frequency-domain data. The MDCT
coefficients converted in this manner are then converted
into spectrum powers Si for I frequency bands (i = 1, 2,
..., I, with I equal to, for example, 25). Processing like
that shown in Figure 3 is then carried out to allocate
quantized bits in accordance with each spectrum power Si
thus obtained.

The audio compression circuit 6 includes a table ROM 6a,
and in the table ROM 6a are stored masking characteristics
and/or minimum limit of audibility characteristics according
to the ATRAC method. These minimum limit of audibility
characteristics appear as a curve shown by reference symbols
a1, a2, a3, and a4 on Figure 1. The masking characteristics,
calculated in accordance with the spectrum powers Si, a
critical band width of each frequency band, etc., appear,
for a power distribution like that shown in Figure 1, for
example, as a curve shown by reference symbols a11, a12, and
a13. The minimum limit of audibility characteristics shown
by the reference symbols a1 through a4 and the masking
characteristics shown by reference symbols a11 through a13
are prepared in accordance with the aural-psychological
characteristics of persons with typical hearing
characteristics, and are fixed characteristics.

However, in the first embodiment of the present
invention, the minimum limit of audibility and/or the
masking characteristics can be changed. In concrete terms,
for example in the case of the masking characteristics, the
greater the spectrum power and the higher the frequency, the
larger the range of masking of other frequency bands. In the
example in Figure 1, the maximum limit Smax of the range
influenced by spectrum power S5, which is a peak power, is
shown by a13 × (1 ± Ik). Here, Ik is a coefficient for
weighting. If a plurality of variables k are stored in
advance in the table ROM 6a, and the variables k are
switched by means of a register 36a in the system control
microcomputer 36, the masking characteristic curve a13 can
be changed within the range from a14 through a15. The
variable k can be set by the listener through the input
operating means 37.
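
By way of illustration only, the weighting of a threshold curve described above might be sketched in Python as follows. The function names, the list representation of the curve, and the sample values are assumptions introduced for this sketch; the specification states only that the curve is multiplied by a factor of the form (1 ± coefficient), or alternatively given an offset.

```python
from typing import List, Sequence


def weight_curve(curve: Sequence[float], weight: float) -> List[float]:
    """Scale every point of a threshold curve by (1 + weight).

    A positive weight raises the curve (more masking, as with a14);
    a negative weight lowers it (less masking, as with a15).
    """
    if weight <= -1.0:
        raise ValueError("weight must be greater than -1")
    return [level * (1.0 + weight) for level in curve]


def offset_curve(curve: Sequence[float], offset: float) -> List[float]:
    """Shift every point of a threshold curve by a constant offset."""
    return [level + offset for level in curve]


# Hypothetical reference curve (one level per frequency band).
reference = [1.0, 2.0, 4.0]
raised = weight_curve(reference, 0.5)    # stronger masking
lowered = weight_curve(reference, -0.5)  # weaker masking
shifted = offset_curve(reference, 1.0)
```

In such a sketch the listener-selected register value would simply pick one of several stored weights; how the stored values map to register settings is not specified here.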

For example, by changing the masking characteristic
curve from a13 to a14, the band masked is widened, the level
of masking is increased, and the number of bits allocated to
signals with low power is decreased, or even eliminated.
Accordingly, bit allocation to signals of relatively greater
power is increased, and the dynamic range of the high-power
signals is increased. If, on the other hand, the masking
characteristic curve is changed from a13 to a15, bit
allocation to low-power signals is increased, and bit
allocation to signals of relatively greater power is
decreased. Accordingly, the frequency range can be enlarged.
The same effect can also be obtained by giving the masking
characteristic curve a13 an offset instead of weighting.

In the same way, with regard to the minimum limit of
audibility characteristics, the minimum limit of audibility
characteristic curve a1 through a4, which is based on the
aural-psychological characteristics of persons with typical
hearing characteristics, can be weighted or given an offset,
thereby changing the a4 portion of the curve, for example,
as shown by reference symbol a5. In this way, relatively
more bits are allocated to the high-frequency bands.

Next, processing for allocation of quantized bits will
be explained, referring to Figure 3. First, in Step p1, the
spectrum power Si of each frequency band is calculated from
the sum of squares of the MDCT coefficients for that
frequency band (which are obtained by means of the MDCT
processing). In Step p2, the audio compression circuit 6
selects, through the register 36a of the system control
microcomputer 36, parameters for change of masking
characteristics, such as the variables k, which are stored
in the table ROM 6a. In Step p3, in the same way as in Step
p2, parameters for change of the minimum limit of audibility
characteristics are selected.
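
As a minimal sketch of Step p1, the sum-of-squares calculation might be written in Python as follows. The function name, the band-edge layout, and the sample coefficients are hypothetical; the actual ATRAC band boundaries are not given in this description.

```python
from typing import List, Sequence


def band_powers(mdct: Sequence[float], band_edges: Sequence[int]) -> List[float]:
    """Return the spectrum power Si of each band as a sum of squares.

    band_edges holds I + 1 increasing indices; band i covers
    mdct[band_edges[i]:band_edges[i + 1]]. The layout is illustrative only.
    """
    return [
        sum(c * c for c in mdct[lo:hi])
        for lo, hi in zip(band_edges[:-1], band_edges[1:])
    ]


# Six hypothetical MDCT coefficients split into three bands of two.
coeffs = [1.0, -2.0, 0.5, 0.5, 3.0, 0.0]
edges = [0, 2, 4, 6]
powers = band_powers(coeffs, edges)
```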

In Step p4, reference masking characteristics and
minimum limit of audibility characteristics previously
calculated and stored in the table ROM 6a are changed in
accordance with the parameters selected in Steps p2 and p3,
and these two characteristics are synthesized in order to
determine a final masking threshold. In other words, if the
minimum limit of audibility characteristic curve thus
changed is as shown by the reference symbols a1, a2, a3, a5,
and the masking characteristic curve thus changed is as
shown by the reference symbols a11, a12, a14, the curve of
the final masking threshold obtained by synthesis will be as
shown by the reference symbols a1, a12, a14, a3, a5.
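
The synthesis in Step p4 is not spelled out beyond the resulting curve. One plausible reading, offered here only as an assumption, is that the final threshold follows whichever of the two curves is higher in each band. A Python sketch under that assumption (the function name and sample levels are hypothetical):

```python
from typing import List, Sequence


def synthesize_threshold(
    audibility: Sequence[float], masking: Sequence[float]
) -> List[float]:
    """Combine two per-band curves into one masking threshold.

    Assumption: the upper envelope (element-wise maximum) is taken,
    so a band is governed by whichever limit hides more.
    """
    if len(audibility) != len(masking):
        raise ValueError("curves must cover the same bands")
    return [max(a, m) for a, m in zip(audibility, masking)]


# Hypothetical levels: audibility dominates at the band edges,
# masking dominates around the spectral peak in the middle.
audibility_curve = [5.0, 1.0, 1.0, 4.0]
masking_curve = [0.0, 3.0, 2.0, 0.5]
final_threshold = synthesize_threshold(audibility_curve, masking_curve)
```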

In Step p5, if the index of each frequency band is i,
the ratio of the frequency band's spectrum power Si
(calculated in Step p1) to its masking threshold Mi
(calculated in Step p4), SMRi = Si/Mi, is calculated for all
frequency bands. On a logarithmic graph, the ratio SMRi for
each frequency band will correspond to that part of the
length of the spectrum power Si which exceeds the masking
threshold Mi.

EP 0 855 805 A2

Next, in Step p6, the ratio of spectrum power Si to the
power of quantized noise Ni(n), when the spectrum power Si
of each frequency band is quantized into n bits, is
calculated: SNRi(n) = Si/Ni(n). Statistically, the ratio
SNRi(n) is a constant in accordance with the characteristics
of the signal, so it may be calculated in advance by
statistical processing. Dividing the ratio SNRi(n) by the
ratio SMRi gives the ratio of masking threshold to the power
of quantized noise, MNRi(n) = SNRi(n)/SMRi.
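
The relationship among the three ratios can be illustrated with a short Python sketch. The description says only that SNRi(n) may be determined in advance statistically; the model used below, in which each added bit multiplies the signal-to-noise power ratio by 4 (about 6 dB per bit), is an assumption made purely for the example, as are the function names.

```python
def smr(power: float, threshold: float) -> float:
    """Signal-to-mask ratio SMRi = Si / Mi."""
    return power / threshold


def snr(n_bits: int) -> float:
    """Assumed signal-to-noise power ratio for n-bit quantization.

    Illustrative model only: a factor of 4 per bit.
    """
    return 4.0 ** n_bits


def mnr(power: float, threshold: float, n_bits: int) -> float:
    """Mask-to-noise ratio MNRi(n) = SNRi(n) / SMRi."""
    return snr(n_bits) / smr(power, threshold)


# With Si = 16, Mi = 2 and n = 2, the assumed noise power is 16 / 16 = 1,
# so the mask-to-noise ratio equals Mi / Ni(n) = 2.
example_mnr = mnr(16.0, 2.0, 2)
```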

In Step p7, the quantized bits are allocated to each
frequency band as follows. The number of bits n is increased
from 0, and, at each increase, the ratio of masking
threshold to power of quantized noise MNRi(n) is calculated
for each frequency band, and a bit is allocated to the
frequency band where the ratio MNRi(n) is the smallest. In
this way, each time the number of quantized bits n is
increased, a bit is allocated to the frequency band with the
smallest ratio MNRi(n), and if this is repeated until
allocation of all available bits is completed, the word
length of each frequency band is determined, and this is
outputted. In other words, bits are allocated starting with
the frequency band in which the length of that part of the
spectrum power Si exceeding the threshold Mi is longest.
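
The loop of Step p7 might be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the circuit's actual procedure: each step costs one bit of budget and adds one bit to a band's word length, the quantization-noise model (a factor of 4 in signal-to-noise power ratio per bit) is assumed, and the names and sample values are hypothetical.

```python
from typing import List, Sequence


def allocate_bits(
    powers: Sequence[float],
    thresholds: Sequence[float],
    total_bits: int,
    max_bits: int = 16,
) -> List[int]:
    """Greedily give one bit at a time to the band with the smallest MNR."""
    bits = [0] * len(powers)

    def mnr(i: int) -> float:
        # MNRi(n) = Mi / Ni(n), with the assumed noise Ni(n) = Si / 4**n.
        if powers[i] <= 0.0:
            return float("inf")  # a silent band never needs bits
        return thresholds[i] * (4.0 ** bits[i]) / powers[i]

    for _ in range(total_bits):
        candidates = [i for i in range(len(powers)) if bits[i] < max_bits]
        if not candidates:
            break
        worst = min(candidates, key=mnr)
        bits[worst] += 1
    return bits


# Three hypothetical bands with equal thresholds and falling power.
word_lengths = allocate_bits([64.0, 8.0, 1.0], [1.0, 1.0, 1.0], total_bits=4)
```

In this example the loudest band receives the first two bits, after which its remaining noise is low enough that the second band takes a turn.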

Thus, bits are allocated in such a way that the masking
threshold, as shown in Figure 1, is changed to accord with
the listener's preference.

The foregoing has described the case of change of both
the masking characteristics and the minimum limit of
audibility characteristics, but the present invention is not
limited to such a case; either the masking characteristics
or the minimum limit of audibility characteristics may be
changed alone.

In short, change of the minimum limit of audibility
characteristics alone, for example, makes it possible to
select whether or not to allocate bits to small spectra in
the inaudible range or spectra in the ultra-low or
ultra-high ranges. Again, change of the masking
characteristics only, since it entails change of masking
characteristics which are determined by the critical bands
in accordance with the power and the frequency of each
frequency band, makes it possible to select whether or not
to allocate bits to spectra which are masked by spectra with
comparatively high power. In this way, sound quality which
accords with the hearing of each listener can be obtained.

The second embodiment of the present invention 
will be explained below, in reference to Figure 4. 

Figure 4 is a flow chart for explaining the bit
allocation method in the second embodiment of the present
invention. The notable feature of this bit allocation method
is that it is possible to set a desired percentage x between
(a) the bit allocation according to the ratio of masking
threshold to quantized noise MNRi(n) and (b) the bit
allocation according to the power of quantized noise SNi(n),
when the spectrum power Si, which is a representative value
of the power or energy of each frequency band, is quantized
into n bits. Several of the percentages x of (a) to (b) are
stored in advance in the table ROM 6a of the audio
compression circuit 6, and selection among the different
percentages x can be performed, through the register 36a of
the system control microcomputer 36, in response to the
operations from the input operating means 37.

In concrete terms, first, in Step p11, in the same way
as in Step p1 in the first embodiment, the spectrum power Si
of each frequency band is calculated from the sum of squares
of the respective MDCT coefficients. In Step p12, the value
in the register 36a of the system control microcomputer 36
is read, and the corresponding percentage x% is selected
from the table ROM 6a.

If the percentage x determined in this way is 0, i.e.,
when the number of bits B1 available for the first
allocation is 0, then the bit allocation according to the
ratio of masking threshold to quantized noise will not be
performed, and the processing proceeds directly to Step p18,
which will be discussed below. In contrast, if the number of
bits B1 available for the first allocation is not 0, then
Step p13 is carried out.

In Step p13, given a total number of mini-disc audio
spectrum data bits B0 (1,144 to 1,464 bits), the number of
bits B1 available for the first allocation in accordance
with the ratio MNRi(n) is calculated: B1 = B0 * (x/100).

In Step p14, in accordance with previously-calculated
masking characteristics and minimum limit of audibility
characteristics corresponding to the aural-psychological
characteristics of persons with typical hearing, a masking
threshold, i.e., the curve a1, a12, a13, a3, a4, is
calculated. Then, in Steps p15 and p16, as in Steps p5 and
p6 above, the ratio of masking threshold to power of
quantized noise MNRi(n) for each frequency band is
calculated from the ratio SMRi of the frequency band's
spectrum power Si to its masking threshold Mi. In Step p17,
bit allocation is performed in the same way as in Step p7
above, but the total number of bits allocated in Step p17 is
the number of bits available for the first allocation B1, as
calculated in Step p13 above.

In Step p18, the power of quantized noise SNi(n) is
calculated, and in Step p19, bits are allocated to the
frequency band with the highest power of quantized noise
SNi(n). Thereafter, the power of quantized noise SNi(n) is
re-calculated, bits are allocated to the band where this
value is highest, and this is repeated until all bits
available for the second allocation B2 = B0 * (1 - (x/100))
have been allocated. Steps p18 and p19 are carried out when
the number of bits available for the first allocation is 0
and the processing has proceeded directly from Step p12 to
Step p18, or when x does not equal 100, but when x does
equal 100, i.e., when B2 = 0, the word length is outputted
directly after Step p17.
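
The two-stage allocation of Figure 4 might be sketched in Python as follows. Several points are assumptions made only for this illustration: B1 is rounded down to a whole number of bits, the second stage continues from the word lengths left by the first stage, every band power is positive, each step costs one bit of budget, and the quantization-noise model (noise power falling by a factor of 4 per bit) is assumed. The names and sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_budget(total_bits: int, x_percent: int) -> Tuple[int, int]:
    """Return (B1, B2): bits for the MNR stage and for the noise-power stage."""
    first = total_bits * x_percent // 100  # rounding down is an assumption
    return first, total_bits - first


def allocate_two_stage(
    powers: Sequence[float],
    thresholds: Sequence[float],
    total_bits: int,
    x_percent: int,
) -> List[int]:
    """Allocate B1 bits by smallest MNR, then B2 bits by largest noise power."""
    b1, b2 = split_budget(total_bits, x_percent)
    bits = [0] * len(powers)
    bands = range(len(powers))

    for _ in range(b1):  # stage (a): masking threshold to noise ratio
        i = min(bands, key=lambda j: thresholds[j] * 4.0 ** bits[j] / powers[j])
        bits[i] += 1
    for _ in range(b2):  # stage (b): power of quantized noise
        i = max(bands, key=lambda j: powers[j] / 4.0 ** bits[j])
        bits[i] += 1
    return bits


powers = [64.0, 8.0, 1.0]
thresholds = [16.0, 1.0, 2.0]
only_mnr = allocate_two_stage(powers, thresholds, 4, 100)
only_power = allocate_two_stage(powers, thresholds, 4, 0)
```

With these sample values, allocation by MNR alone spreads bits across the first two bands, while allocation by noise power alone concentrates them on the strongest band, which is the trade-off the percentage x is meant to control.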

In cases when the input signal is a composite wave of a
sine wave signal and white noise, and in other cases where
it resembles a single sine wave, for example with a solo
piano piece, if the bit allocation is performed only
according to the ratio of masking threshold to quantized
noise MNRi(n), many bits will be allocated to noise elements
with low power, and the error in quantizing of the piano
becomes relatively great. However, if the bit allocation
percentage x can be changed as outlined above, the bit
allocation according to the power of quantized noise SNi(n)
is carried out in addition to that according to the ratio of
masking threshold to quantized noise MNRi(n), so that the
number of bits allocated to the piano can be increased and
the error in quantizing of the piano is reduced.

Again, if the input signal is composed of sound with 
many local peaks and noise, for example an orchestra 
piece, the bit allocation can be performed in accordance 
with the ratio of masking threshold to quantized noise 
MNRi(n), in which the noise and the musical tones com- 
posing small local peaks in bands close to large signals 
can be masked, thus allocating no bits to them, and 
more bits can be allocated to large signals which are not 
masked. This enables high fidelity recording. 

Further, with input signals lying between the forego- 
ing two examples, which are composed of a musical 
tone with three or four local peaks and noise, for exam- 
ple a solo clarinet piece, by giving weight both to the bit 
allocation according to the ratio of masking threshold to 
quantized noise MNRi(n) and to the bit allocation 
according to the power of quantized noise SNi(n), fidel- 
ity of the clarinet can be improved. 

In this way, the bit allocation method most suited to 
any musical tone source can be selected. 

The third embodiment of the present invention will 
next be explained, in reference to Figures 5 and 6. 

Figure 5 is a flow chart for explaining the bit allocation
method in the third embodiment of the present
invention. The notable feature of this bit allocation 
method is that the percentage x of (a) the bit allocation 
according to the ratio of masking threshold to quantized 
noise MNRi(n) to (b) the bit allocation according to the 
power of quantized noise SNi(n) is automatically deter- 
mined on the basis of the relationship between (1) 
peaks and local peaks in spectrum powers Si and (2) 
masking thresholds. 

First the peak value among the spectrum powers 
of all frequency bands from S1 to SI, such as that shown
by reference symbol S5 on Figure 6, is found. Then a 
masking threshold, such as that shown on Figure 6, 
which includes masking characteristics due to that peak 
level, is found. Next local peaks such as that shown by 
reference symbol S8 on Figure 6 are found for each fre- 
quency band. The number of such local peaks masked 
by the masking threshold, and the number of such local 
peaks not so masked, are respectively found, and the 
ratio between masked local peaks and unmasked local
peaks determines the percentage x. 

In other words, if the total number of local peaks is
NM, and the number of masked local peaks is M, then:



M/(NM+1)=0 (1) 

Accordingly, if there are no masked local peaks, the percentage
x will be 0%, and the number of bits available
for the first allocation B1 will be set at 0. If, on the other
hand,

0 < M/(NM+1) ≤ 0.5 (2)

then x is from 50% to 90%, and if

0.5< M/(NM+1) (3) 

then x is 100%, and the number of bits available for the
first allocation B1 will be the entirety of total available
bits B0.
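
The selection rule of equations (1) through (3) can be illustrated with
a short sketch. The function name allocation_percentage and the
parameter mid_x are hypothetical. The text gives only a range of 50% to
90% for the case of equation (2) and does not say how a value within
that range is chosen, so the sketch leaves it to the caller. A ratio of
exactly 0.5 is treated as falling under equation (2), following the
Figure 6 example in the text.

```python
def allocation_percentage(masked, total_local_peaks, mid_x=70):
    """Return the percentage x from M (masked) and NM (total_local_peaks).

    mid_x is a caller-supplied value for the case of equation (2); the
    default of 70 is an arbitrary assumption within the stated 50-90 range.
    """
    if masked < 0 or masked > total_local_peaks:
        raise ValueError("masked must lie between 0 and total_local_peaks")
    if not 50 <= mid_x <= 90:
        raise ValueError("mid_x must lie between 50 and 90")

    ratio = masked / (total_local_peaks + 1)  # M / (NM + 1)
    if ratio == 0:
        return 0       # equation (1): no masked local peaks, B1 = 0
    if ratio <= 0.5:
        return mid_x   # equation (2): x between 50% and 90%
    return 100         # equation (3): B1 is the entirety of B0
```

For instance, one local peak that is masked gives a ratio of 1/(1+1) =
0.5 and therefore returns mid_x.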

Here the detection of local peaks and the selection
of the percentage x will be discussed. The local peaks
are found for all the frequency bands after the peak
spectrum power (S5 in Figure 6) is found. In the example
in Figure 6, the difference D34, D45, ..., D89 and its
polarity between each spectrum power S3 to S9 within
a certain number of frequency bands from peak value
S5 (in the example in Figure 6, two frequency bands on
the low-frequency side and four on the high-frequency
side) is found, and the local peaks are detected on the
basis of change in polarity and the absolute value of
those differences. In this way, the local peaks are found
across all frequency bands. In concrete terms, in the
case of Figure 6, there is only one local peak (S8), and
that local peak is masked by the masking threshold;
thus M/(NM+1) = 1/(1+1) = 0.5. Accordingly, equation
(2) above will be applied, and a percentage x = 50% to
90% will be selected.
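
The detection of local peaks from the polarity of adjacent differences
might be sketched as follows. The numerical values are hypothetical and
only imitate the shape described for Figure 6 (a peak at S5 and one
masked local peak at S8). The function names, the min_step parameter,
and the rule that a local peak counts as masked when its power lies
below the masking threshold are assumptions of this sketch. For
simplicity it scans all bands at once rather than a window around the
peak.

```python
def find_local_peaks(powers, peak_index, min_step=0.0):
    """Return indices where the adjacent difference turns from + to -.

    min_step is a hypothetical lower bound on the absolute value of both
    differences; the overall peak itself is not counted as a local peak.
    """
    peaks = []
    for i in range(1, len(powers) - 1):
        if i == peak_index:
            continue
        rise = powers[i] - powers[i - 1]   # difference to the lower band
        fall = powers[i + 1] - powers[i]   # difference to the higher band
        if rise > 0 and fall < 0 and min(abs(rise), abs(fall)) >= min_step:
            peaks.append(i)
    return peaks


def count_masked(powers, mask, local_peaks):
    """Count local peaks whose power lies below the masking threshold."""
    return sum(1 for i in local_peaks if powers[i] < mask[i])


# Hypothetical powers S1..S9 and masking thresholds, in dB.
powers = [10, 20, 35, 50, 80, 40, 25, 30, 15]
mask = [5, 15, 40, 60, 80, 60, 45, 35, 20]

peak = max(range(len(powers)), key=lambda i: powers[i])  # index of S5
local = find_local_peaks(powers, peak)                   # index of S8 only
m = count_masked(powers, mask, local)
ratio = m / (len(local) + 1)                             # M / (NM + 1)
```

With these values there is one local peak and it is masked, so the ratio
is 1/(1+1) = 0.5, as in the Figure 6 example.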

Next, the bit allocation method of the third embodiment
will be explained in reference to Figure 5.

In this bit allocation method, after the spectrum
power Si of each frequency band has been calculated in
Step p21 as in Step p11 and Step p1 above, the peak
value is found in Step p22, and the masking threshold
including the masking characteristics of that peak value
is found in Step p23. In Step p24, the percentage x is
calculated by means of equations (1) through (3) above,
and the number of bits available for the first allocation
B1 is calculated. Then, in Steps p25 through p27, as in
Steps p15 through p17 above, the first bit allocation,
according to the ratio of masking threshold to quantized
noise MNRi(n), is performed, and then in Steps p28 and
p29, as in Steps p18 and p19 above, the second bit allocation,
according to the power of quantized noise
SNi(n), is performed.

In this way, a bit allocation with high sound quality
appropriate to the musical tones like that shown in Figure 4
can be performed automatically, and deterioration
of sound quality, even with respect to musical tones not
suited to the bit allocation according to the ratio of
masking threshold to quantized noise MNRi(n), can be
prevented.

The foregoing explains the case in which the percentage
x is calculated by means of equations (1)
through (3) above (the preferred case), but the present
invention need not be limited to this case. A similar
effect may be obtained by calculation of the percentage
x by means of equations (1) and (3).

The change of the minimum limit of audibility and/or
masking characteristics shown in Figures 1 and 3 above
may also be applied to the bit allocations shown in
Figures 4 and 5.

The concrete embodiments and examples of imple- 
mentation discussed in the foregoing detailed explana- 
tions of the present invention serve solely to illustrate 
the technical details of this invention, and the present 
invention should not be narrowly interpreted within the
limits of such concrete examples, but rather may be 
applied in many variations without departing from the 
spirit of this invention and the scope of the patent claims 
set forth below. 

Claims 

1. A method of encoding digital data comprising the 
steps of: 

(a) converting the digital data into a frequency 
spectrum, and dividing the frequency spectrum 
into a plurality of frequency bands;

(b) changing a minimum limit of audibility characteristic
among aural-psychological characteristics
so as to set a masking threshold;

(c) finding a ratio of masking threshold to noise 
for each frequency band, based on power or 
energy of each frequency band; and 

(d) allocating a number of quantized bits to
each frequency band in accordance with the 
ratio of masking threshold to noise. 

2. The method of encoding digital data according to 
Claim 1, wherein:

in said steps (c) and (d), the number of the 
quantized bits is increased incrementally from 
0, and, at each increase, the ratio of masking 
threshold to noise is found for each frequency
band, and bits are allocated to a frequency 
band having the smallest ratio of masking 
threshold to noise. 

3. A method of encoding digital data comprising the
steps of: 

(a) converting the digital data into a frequency
spectrum, and dividing the frequency spectrum
into a plurality of frequency bands;

(b) changing a masking characteristic among 
aural-psychological characteristics so as to set 
a masking threshold; 



(c) finding the ratio of masking threshold to 
noise for each frequency band, based on 
power or energy of each frequency band; and 

(d) allocating a number of quantized bits to 
each frequency band in accordance with the 
ratio of masking threshold to noise. 

4. The method of encoding digital data according to 
Claim 3, wherein: 

in said step (b), after change of the masking 
characteristic, the minimum limit of audibility 
characteristic among aural-psychological char- 
acteristics is changed so as to set the masking 
threshold. 

5. The method of encoding digital data according to 
Claim 3, wherein: 

in said steps (c) and (d), the number of the 
quantized bits is increased incrementally from 
0, and, at each increase, the ratio of masking 
threshold to noise is found for each frequency 
band, and bits are allocated to a frequency 
band having the smallest ratio of masking 
threshold to noise. 

6. The method of encoding digital data according to 
Claim 4, wherein: 

in said steps (c) and (d), the number of the 
quantized bits is increased incrementally from 
0, and, at each increase, the ratio of masking 
threshold to noise is found for each frequency 
band, and bits are allocated to a frequency 
band having the smallest ratio of masking 
threshold to noise.

7. A method of encoding digital data in which the dig- 
ital data is converted into a frequency spectrum, the 
frequency spectrum is divided into frequency 
bands, and quantized bits are allocated to each fre- 
quency band, said method comprising the steps of: 

setting a percentage for allocation of the quan- 
tized bits, the percentage being changeable; 
and 

(i) performing a first allocation of the quantized
bits in accordance with ratios of
masking threshold to noise which are
found for each frequency band in accordance
with power or energy of each frequency
band in consideration of aural-psychological
characteristics; and
(ii) performing a second allocation of the
quantized bits in accordance with a representative
value of the power or energy of
each frequency band;

wherein said steps (i) and (ii) are performed
in accordance with the percentage so
as to allocate the total number of quantized
bits, thereby allocating the number of the quantized
bits of each frequency band.

8. The method of encoding digital data according to
Claim 7, wherein:

the percentage is determined in accordance
with a relationship between the masking
threshold and peaks and local peaks found
based on differences in power or energy
between adjacent spectra within each
frequency band.

9. The method of encoding digital data according to
Claim 7, wherein:

the percentage corresponds, if NM is the total
number of the local peaks, with a ratio of the
number M of the local peaks which are masked
by the masking threshold to the number N of
the local peaks which are not masked by the
masking threshold.

10. The method of encoding digital data according to
Claim 9, wherein:

the percentage is set to 0% when
M/(NM+1) = 0 is satisfied; and
the percentage is set to 100% when
0.5 < M/(NM+1) is satisfied.

11. A method of encoding digital data in which, in
converting digital data such as musical tones and
sounds into frequency domains, dividing the converted
spectra into a plurality of frequency bands,
and allocating bits for each frequency band so as to
encode the digital data, the allocating of bits is performed
in accordance with a ratio of masking
threshold to noise of each frequency band in consideration
of aural-psychological characteristics,
the ratio being calculated for each frequency band
in accordance with power or energy of each
frequency band, wherein:

minimum limit of audibility characteristics that
have been previously found are changeable.

12. A method of encoding digital data in which, in
converting digital data such as musical tones and
sounds into frequency domains, dividing the converted
spectra into a plurality of frequency bands,
and allocating bits for each frequency band so as to
encode the digital data, the allocating of bits is performed
in accordance with a ratio of masking
threshold to noise of each frequency band in consideration
of aural-psychological characteristics,
the ratio being calculated for each frequency band
in accordance with power or energy of each
frequency band, wherein:

masking characteristics that have been previously
found in accordance with the power or
the energy are changeable.

13. A method of encoding digital data in which digital data
such as musical tones and sounds is converted into
frequency domains, the converted spectra are
divided into a plurality of frequency bands, and bits
are allocated for each frequency band so as to perform
the encoding, said method comprising the
steps of:

(i) performing a first allocation of the quantized
bits in accordance with ratios of masking
threshold to noise which are found for each
frequency band in accordance with power or
energy of each frequency band;

(ii) performing a second allocation of the quantized
bits in accordance with a representative
value of the power or the energy of each
frequency band; and

(iii) performing a third allocation of the quantized
bits giving weight to the bit allocation
methods of each of said steps (i) and (ii);

wherein said steps (i), (ii), and (iii) are
switchable.

14. The method of encoding digital data according to
Claim 13, wherein:

said steps (i), (ii), and (iii) are switched in
accordance with a relationship between the
masking threshold and peaks and local peaks
found based on differences in power or energy
between adjacent spectra within each
frequency band.